Valence extraction using EM selection and co-occurrence matrices
Abstract
This paper discusses two new procedures for extracting verb valences from raw texts, with an application to the Polish language. The first novel technique, the EM selection algorithm, performs unsupervised disambiguation of valence frame forests, obtained by applying a non-probabilistic deep grammar parser and some post-processing to the text. The second new idea concerns filtering of incorrect frames detected in the parsed text and is motivated by an observation that verbs which take similar arguments tend to have similar frames. This phenomenon is described in terms of newly introduced co-occurrence matrices. Using co-occurrence matrices, we split filtering into two steps. The list of valid arguments is first determined for each verb, whereas the pattern according to which the arguments are combined into frames is computed in the following stage. Our best extracted dictionary reaches an $F$-score of 45%, compared to an $F$-score of 39% for the standard frame-based BHT filtering.
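The abstract does not spell out the update rule of the EM selection algorithm, so the following is only a minimal sketch of a generic EM-style procedure for choosing one frame per verb occurrence from a set of parser alternatives. The function name `em_select`, the frame strings, and the toy data are hypothetical illustrations and are not taken from the paper.

```python
from collections import defaultdict


def em_select(candidate_sets, iterations=50):
    """Pick one frame per occurrence from ambiguous candidate sets.

    candidate_sets: list of non-empty lists of frame strings, one list per
    verb occurrence (hypothetical input format, assumed for illustration).
    Returns (chosen_frames, frame_probabilities).
    """
    frames = {f for cands in candidate_sets for f in cands}
    # Start from a uniform distribution over all observed frames.
    prob = {f: 1.0 / len(frames) for f in frames}
    n = len(candidate_sets)

    for _ in range(iterations):
        counts = defaultdict(float)
        # E-step: share one unit of count among the candidates of each
        # occurrence, proportionally to the current frame probabilities.
        for cands in candidate_sets:
            unique = set(cands)
            total = sum(prob[f] for f in unique)
            for f in unique:
                counts[f] += prob[f] / total
        # M-step: re-estimate probabilities from the expected counts.
        prob = {f: counts[f] / n for f in frames}

    # Selection: keep the most probable candidate for each occurrence
    # (ties are broken by alphabetical order of the frame string).
    chosen = [max(sorted(set(cands)), key=lambda f: prob[f])
              for cands in candidate_sets]
    return chosen, prob


# Toy example: three occurrences of one verb with ambiguous parses.
toy = [
    ["np(nom) np(acc)", "np(nom)"],
    ["np(nom) np(acc)"],
    ["np(nom) np(acc)", "np(nom) pp(do)"],
]
chosen, prob = em_select(toy)
```

In this toy input the frame that also occurs unambiguously gathers most of the probability mass, so it is selected for the ambiguous occurrences as well. The filtering steps based on co-occurrence matrices are not sketched here, because the abstract gives too little detail to do so faithfully.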
Related resources
Textural Feature Extraction and Classification of Mammogram Images using CCCM and PNN
This work presents and investigates the discriminatory capability of contourlet coefficient cooccurrence matrix features in the analysis of mammogram images and its classification. It has been revealed that contourlet transform has a remarkable potential for analysis of images representing smooth contours and fine geometrical structures, thus suitable for textural details. Initially the ROI (Re...
Investigating the effect of inconsistency of variance-covariance matrices in the selection index
In the selection index procedure, phenotypic and genetic (co)variance matrices of traits are used to calculate genetic parameters such as index coefficients, index variance, genetic gain in the selection goal, and selection accuracy. Sometimes these matrices are inconsistent or not positive definite. In the current study, for investigation of the effect of ...
Implementing Texture Feature Extraction Algorithms on FPGA
Faculty of Electrical Engineering, Mathematics and Computer Science CE-MS-2009-25 Feature extraction is a key function in various image processing applications. A feature is an image characteristic that can capture certain visual property of the image. Texture is an important feature of many image types, which is the pattern of information or arrangement of the structure found in a picture. Tex...
Statistical Feature Selection for Image Texture Analysis
Texture is one of the visual features used in Content Based Image Retrieval (CBIR) to represent the contents of an image with respect to characteristics such as brightness, color, shape, size, etc. Texture is a property that represents the spatial distribution of an image. Texture can be defined as a repetition of an element or pattern in a problem space. Texture analysis can be used for classificatio...
Journal: Language Resources and Evaluation
Volume: 43
Publication year: 2009